Metabarcoding reveals unique microbial mat communities and evidence of biogeographic influence in low‐oxygen, high‐sulfur sinkholes and springs

Abstract High‐sulfur, low‐oxygen environments formed by underwater sinkholes and springs create unique habitats populated by microbial mat communities. To explore the diversity and biogeography of these mats, samples were collected from three sites in Alpena, Michigan, one site in Monroe, Michigan, and one site in Palm Coast, Florida. Our study investigated previously undescribed eukaryotic diversity in these habitats and further explored their bacterial communities. Mat samples and water parameters were collected from sulfur spring sites during the spring, summer, and fall of 2022. Cyanobacteria and diatoms were cultured from mat subsamples to create a culture‐based DNA reference library. Remaining mat samples were used for metabarcoding of the 16S and rbcL regions to explore bacterial and diatom diversity, respectively. Analyses of water chemistry, alpha diversity, and beta diversity articulated a range of high‐sulfur, low‐oxygen habitats, each with distinct microbial communities. Conductivity, pH, dissolved oxygen, temperature, sulfate, and chloride had significant influences on community composition but did not describe the differences between communities well. Chloride concentration had the strongest correlation with microbial community structure. Mantel tests revealed that biogeography contributed to differences between communities as well. Our results provide novel information on microbial mat composition and present evidence that both local conditions and biogeography influence these unique communities.


| INTRODUCTION
Areas of karst geology are found throughout the Laurentian Great Lakes (Biddanda et al., 2006) and Florida (Barrios, 2006). In these regions, high-sulfur, low-oxygen groundwater dissolves surrounding bedrock and is released at the surface through springs and sinkholes (Biddanda et al., 2006). These conditions produce harsh environments that are often dominated by microbial mats (Franks & Stolz, 2009; Voorhies et al., 2012), which are thin, horizontally stratified layers of microbes over the sediment (Stal, 1995). While microbial mats can be found in a variety of habitats today, they are often the only life form able to tolerate the conditions in extreme aquatic environments (Prieto-Barajas et al., 2018). Microbial mats are of interest because they are analogous to the communities that lived in ancient seas and contributed to the oxygenation of Earth's atmosphere (Dick et al., 2018).
Such sites include sulfur springs, hot springs, and Antarctic lakes, providing similar habitats to those where Earth's most ancient life was found (Allwood, 2016). Modern microbial mats employ a variety of metabolic strategies to ensure their survival under these unique conditions (Biddanda et al., 2023; Canfield & Des Marais, 1993).
Cyanobacteria and sulfur-oxidizing bacteria are dominant components of these microbial mat communities (Biddanda et al., 2012; Franks & Stolz, 2009; Stal, 1995). Diatoms, another primary producer, are the most common eukaryotes in some mat communities (Gomez et al., 2018; Perillo et al., 2022; Pinckney et al., 1995). In Middle Island Sinkhole (MIS), a submerged sulfur spring sinkhole in Michigan, motile taxa from microbial mats contribute to a complex, three-dimensional mat structure featuring diurnal shifts in position to utilize a variety of metabolic strategies and exploit changing resources, truly a syntrophic system (Biddanda et al., 2015, 2023). In this habitat, cyanobacterial filaments dominate the top of the mat community during the day to exploit sunlight for photosynthesis, and Craticula cuspidata, a motile diatom, migrates vertically through the mats to store nitrogen in the absence of light for nitrogen respiration, giving it an advantage over non-motile organisms in this environment (Biddanda et al., 2023; Merz et al., 2021). Archaea are found in microbial mat communities as well, particularly in the underlying sediment where their primary role may be methanogenesis (Nold, Pangborn, et al., 2010).
The isolated, unique conditions found in these types of spring habitats, along with their usually depauperate flora, present ideal circumstances for investigating microbial biogeography (Power et al., 2018). Biogeography, once expected to have minimal influence on microbial community structure (e.g., Bass Becking, 1934; Finlay & Fenchel, 2004), has been used recently to explain differences in floras that occur in disparate locations, especially aquatic environments (e.g., Dvořák et al., 2021; Filker et al., 2016; Kociolek et al., 2017; Ribeiro et al., 2018). In addition, describing biogeographic trends can contribute to an increased understanding of microbial dispersal, community structure, and composition (Burgsdorf et al., 2014; Lear et al., 2013).
DNA metabarcoding methods have proven useful to investigate diversity of microbial communities and have been used for exploring biogeographic trends (Antich et al., 2022; Pitz et al., 2020; Šupraha et al., 2022). Advantages of metabarcoding over the traditional light microscopy methods for identifying and quantifying algal communities include cost, reproducibility (Kermarrec et al., 2013), detection of species that may be overlooked in morphological assessments due to their small size or rarity (Pérez-Burillo et al., 2022), clear identification of some groups of microbes (especially when reproductive structures are lacking), and reduced reliance on the shrinking pool of taxonomists available for this work (Kahlert et al., 2012). Metabarcoding studies have found higher diversity than detected with morphological methods in studies of cyanobacteria (e.g., Li et al., 2019) and diatoms (e.g., Zimmermann et al., 2015). Using a multi-marker approach has revealed increased diversity across a wide range of taxa in bacterial and algal communities (Marcelino & Verbruggen, 2016; Wolf & Vis, 2019). The large subunit of the RUBISCO (rbcL) marker region has the ability to distinguish between diatom species (e.g., Apothéloz-Perret-Gentil et al., 2021; Hamsher et al., 2011), and the universal 16S V4 marker region has proven useful to detect cyanobacteria, other bacterial taxa, and Archaea (Walters et al., 2015).
In contrast to these advantages, molecular surveys of microbial diversity remain limited by the lack of available reference sequences (Esenkulova et al., 2020; von Wintzingerode et al., 1997) and by misidentification of taxa in reference databases (Dvořák et al., 2018; McGovern et al., 2023). Pairing microbial culturing and sequencing efforts with taxonomy is needed to overcome these barriers and make metabarcoding a viable strategy for ecological studies, especially long-term monitoring efforts.
Transcriptomics and proteomics have also been used to assess community composition and processes that community members are undergoing in MIS (Grim et al., 2021, 2023). Additionally, groundwater analyses have characterized the aquifer sources of MIS (Grim et al., 2023) and GSS (Haack et al., 2005). These studies have revealed the biogeochemical and metabolic processes occurring in these habitats, particularly in MIS, but a deeper look into the taxonomic composition of these sites and exploration of new spring habitats is merited to better describe these communities and factors that influence them.
For this study, multi-marker metabarcoding analyses were performed targeting bacterial, archaeal, and diatom diversity to investigate the microbial mat communities of five low-oxygen, high-sulfur springs in Michigan and Florida. The main goals of this study were to (1) compare water parameters and microbial mat community diversity between these sites with unique conditions; (2) document undescribed taxonomic composition of microbial mat communities from sulfur spring sites using metabarcoding data supplemented by a culture-based DNA reference library; and (3) explore whether environmental characteristics and/or geographic distance between springs drive any differences observed between these microbial mat communities.

| Sites
Three sites were investigated that lie in a region near Alpena, Michigan, wherein karst geology has led to the formation of numerous sinkholes and springs (Biddanda et al., 2006). MIS (45°11′54.2″N 83°19′30.2″W) is a 23-m-deep sinkhole in Lake Huron where cool (~10°C), high-sulfur (>1000 mg/L), low-oxygen (<1 mg/L) groundwater vents and pools in a basin, creating an isolated environment with unique conditions relative to surrounding waters (Biddanda et al., 2006, 2012). Similar environments are created in a nearshore shallow spring outlet in El Cajon Bay (ECB, 45°05′07.5″N 83°19′28.3″W) and an artesian well fountain in downtown Alpena (FTN, 45°03′44.9″N 83°25′52.6″W). Groundwater with nearly identical water parameters characterizes these sites and is thought to originate from a shared source (Snider et al., 2017). Differing levels of sunlight and surface water mixing occur at each site. In MIS, microbial mats receive only 5%-10% of the sunlight measured at the lake surface, while shallower mats at ECB (∼0.25-2 m) receive 50%-90% of this light (Biddanda et al., 2015). Mats from two spring habitats in ECB were sampled, a shallow spring at 0.25-0.5 m in depth, and a deeper spring at ~1 m. Each of the three tiers of FTN was sampled, and mat samples were collected from one area of MIS near its source, at a depth of 23 m.
A large sulfur spring sinkhole with a similar carbonate aquifer groundwater source to the Alpena sites, GSS (41°46′04.3″N 83°27′21.7″W), is surrounded by marshland along the shore of Lake Erie's western basin. The aquifer below dissolved the rock layers around it, forming a 13-m-deep sinkhole, wherein groundwater pours into a 42-m-wide, tufa-rimmed pond (Chaudhary et al., 2009; Lundstrom et al., 2004). Spring water flows out of the pond through a culvert and a channel, eventually emptying into Lake Erie. Mats were collected within the pond near the shoreline, near the spring source at 13 m deep, and at the outlet culvert.
At OAK, the Palm Coast, Florida, site, groundwater flows out of an artesian well into a 4-m-wide bay, where it is contained by a ring of rocks and concrete. Floating microbial mats, benthic mats, and white filaments near the spring outlet were collected at this site.

| Sample collection
Each site was visited in the spring (April-May), summer (June-July), and fall (September) periods. Exceptions include MIS and OAK, which were only sampled during the summer period. During each visit, a YSI multiprobe (Yellow Springs Instruments, Inc., Yellow Springs, OH, USA) was used to measure temperature (Temp), specific conductance (Cond.), and percent dissolved oxygen (ODO.).
Due to multiprobe malfunction, data from a summer 2021 YSI deployment were used to characterize MIS water parameters. In addition to YSI parameters, 250 mL acid-washed Nalgene bottles were used to collect water samples for nutrient analyses at each sampling point. Each water sample was subsampled into two vials, of which one was refrigerated and one was frozen within 24 h of collection.
Mats from wadable sites were collected using a suction device and placed in sterile Whirlpak® bags, and then put on ice for transport to the Annis Water Resources Institute (AWRI, Muskegon, MI, USA). Three replicate mat samples were collected from each habitat type at each site during each sampling event. Mats from MIS were collected by NOAA divers using a coring device and transported to AWRI as intact cores in plastic tubes on ice. Mats from the source of GSS (13 m) were visualized using an Eyoyo underwater camera (Eyoyo Ltd, Shenzhen, China) and collected during the fall sampling period using a 15 m aluminum pole with a 20 μm plankton net affixed to the end for gathering intact mats; the camera was used to guide sampling efforts and ensure representative mat sample collection. Plankton tow samples were also collected at GSS and ECB to determine taxa that may be considered part of the surrounding planktonic community, rather than active members of the microbial mat community. Each mat sample collected was subsampled, with one subsample used for generating unialgal cultures and the other for metabarcoding.

| Culture-based DNA reference library
Similar to the strategy employed in Hamsher et al. (2013), individual diatom cells were isolated from each culturing subsample via micropipette serial dilution to establish unialgal cultures. Monocultures were maintained in WC + Si liquid medium (Guillard & Lorenzen, 1972) at 10°C and a 12:12 light cycle. For morphological identification of cultures, live material was boiled in HNO3 for 1 h, repeatedly washed and settled with ddH2O, dried on coverslips, and mounted on slides using Naphrax®. Each culture was identified to species under 1000× using a Nikon Eclipse Ni-U light microscope with DIC and Krammer and Lange-Bertalot (1986, 1988, 1991a, 1991b). When monocultures had grown to a sufficient density for DNA extraction, cells were harvested by centrifugation and a Chelex extraction was performed following Richlen and Barber (2005). The rbcL region of each culture was amplified using primers rbcL66+ (Alverson et al., 2007) and DPrbcL7- (Jones et al., 2005), Cytiva PuReTaq™ Ready-To-Go™ PCR beads (Cytiva, Marlborough, MA, USA), and a thermocycler protocol of 94°C for 3 min 30 s, then 36 cycles of 94°C for 50 s, 52°C for 50 s, 72°C for 80 s, with a final extension at 72°C for 15 min (Stepanek et al., 2015). The PCR products were frozen and sent to Eurofins Scientific (Louisville, Kentucky) for Sanger sequencing using the PCR primers as well as internal primers CfD+ (Hamsher et al., 2011) and rbcL1255- (Alverson et al., 2007). Sequences were assembled, edited, and aligned using Geneious Prime (Version 11.0.15+10). The final alignment of rbcL sequences included data from 43 cultures (~1370 bp with no indels).
To isolate cyanobacterial taxa, mat samples were spread onto solid Z-8 medium (Rippka, 1988) and, to capture a wider range of cyanobacteria, onto nitrogen-free Z-8 medium, and grown under ambient conditions (23°C, ∼16:8 h light:dark photoperiod). Colonies were individually picked and plated until unialgal cultures were achieved.
Morphology of the strains was analyzed via light microscopy (Nikon Eclipse Ni with DIC), and taxonomic identification was assessed using Wehr et al. (2015) and Komárek and Anagnostidis (2005).
Images were taken with a high-resolution camera (Nikon digital sight DS-U3). Direct PCR was performed as follows: cells were placed at −20°C for 30 min, centrifuged, and the supernatant containing DNA collected. The partial 16S rRNA gene (hereafter abbreviated as 16S) and the whole 16S-23S ITS region (Gaylarde et al., 2004) were amplified using primers CYA8F and CYAB23R (Neilan et al., 1997).

| Metabarcoding
Subsamples for metabarcoding were frozen at −80°C within 36 h of collection, except for MIS samples which were stored at 10°C for 72 h prior to harvesting, then frozen at −80°C, due to logistical limitations. DNA was extracted from the metabarcoding subsamples using the Qiagen PowerSoil DNA Extraction Kit (Qiagen, Crawley, UK) according to the manufacturer's protocol, with a negative control consisting of autoclaved nanopore water included for each subset of extractions and for each primer to assess potential processing contamination. To prepare samples for Illumina amplicon sequencing, a two-step PCR approach was employed. The initial PCR was completed to amplify the two barcode markers (rbcL and 16S) in individual reactions using specific primers with the attached Illumina adapter. The primary PCR amplification was completed in 25 μL reactions using 12.5 μL of Q5 High-Fidelity 2X Master Mix (New England BioLabs Inc., Ipswich, MA, USA), 1.0 μL of each primer (1 μM), 9.5 μL RNase-free H2O, and 1 μL DNA. For the 16S marker, the primer pair and thermocycler protocol from Walters et al. (2015) were employed. For the rbcL marker, we targeted a 312 bp region of the rbcL plastid gene using an equimolar mix of the three forward and two reverse degenerate primers from Vasselon et al. (2017), along with their thermocycler protocol.
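As a quick consistency check on the primary PCR recipe, the listed component volumes can be summed and the final primer concentration derived from C1V1 = C2V2. The sketch below assumes that "each primer" means one forward and one reverse addition per reaction; the `master_mix` helper and its overage parameter are hypothetical conveniences for bench scaling and are not part of the published protocol.

```python
# Per-reaction volumes (uL) as listed for the primary PCR.
# Assumption: "each primer" = one forward plus one reverse addition.
REACTION_UL = {
    "Q5 High-Fidelity 2X Master Mix": 12.5,
    "forward primer (1 uM stock)": 1.0,
    "reverse primer (1 uM stock)": 1.0,
    "RNase-free H2O": 9.5,
    "template DNA": 1.0,
}


def total_volume(components):
    """Sum of all component volumes for one reaction (uL)."""
    return sum(components.values())


def final_concentration(stock_conc, stock_volume, total):
    """Solve C1 * V1 = C2 * V2 for the final concentration C2."""
    return stock_conc * stock_volume / total


def master_mix(components, n_reactions, overage=0.10, exclude=("template DNA",)):
    """Hypothetical helper: scale shared reagents for n reactions.

    Template DNA is excluded because it is added per tube; the overage
    fraction is an illustrative allowance for pipetting loss.
    """
    scale = n_reactions * (1.0 + overage)
    return {
        name: volume * scale
        for name, volume in components.items()
        if name not in exclude
    }


total = total_volume(REACTION_UL)
primer_final_um = final_concentration(1.0, 1.0, total)
```

Under the stated assumption, the components account for the full 25 μL reaction volume, and each primer is diluted 25-fold from its 1 μM stock.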
Following PCR amplification, samples were sent to the University of Tennessee, Knoxville, for processing and sequencing. PCR products were cleaned with Agencourt AmPure XP beads (Beckman Coulter Inc., Indianapolis, IN, USA) and quantified using a Qubit Fluorometer (v.2.0; ThermoFisher Scientific, Waltham, MA, USA). Samples were normalized, and a second PCR reaction (50 μL) enriched with Q5 High-Fidelity 2X Master Mix was performed to apply indexing primers, with the following cycling conditions: 95°C for 3 min followed by 10 cycles of 95°C for 30 s, 55°C for 30 s, 72°C for 30 s, with a final extension of 72°C for 5 min, modified from the 16S protocol (Illumina, 2013). A second PCR clean-up was performed, and samples were quantified using a Qubit Fluorometer. Libraries were loaded with 25% PhiX clustering control on the Illumina MiSeq platform for 300 bp × 2 paired-end reads using the V3 kit.
The resulting sequence datasets were analyzed separately for each marker region. Sequences were demultiplexed and adapters were removed. Primers were trimmed using Cutadapt version 4.2 (Martin, 2011). Using the DADA2 pipeline (Callahan et al., 2016), reads were quality filtered based on Q30 scores and trimmed to remove low-quality reads. Filtered reads were denoised and dereplicated using DADA2 to produce amplicon sequence variants (ASVs).
Singletons, doubletons, and chimeric sequences were removed from the dataset. ASVs identified as chloroplast or mitochondria in the 16S dataset were removed. The SILVA database (release 138.1, Quast et al., 2013) appended with CyanoSeq (Lefler et al., 2023) was employed to assign taxonomy to the 16S ASVs. For the rbcL dataset, taxonomy was assigned using the curated reference database Diat.barcode (Rimet et al., 2019). For both datasets, ASVs matching our culture-generated sequences were assigned to the taxa identified from the corresponding cultures, and this assignment replaced the reference taxonomy (from SILVA/CyanoSeq or Diat.barcode) where the two differed. Only ASVs assigned to diatom taxa were kept for the rbcL marker. Two genera found to dominate the plankton tow samples, Cyclotella and Lindavia, were removed from the rbcL data analyses because they are planktonic taxa and unlikely to be active members of the mat community.

| Statistics
RStudio (v4.4.4; R Core Team, 2022) was used for statistical analyses of the resulting water parameters and metabarcoding data. Water parameters were Tukey-transformed prior to statistical comparisons. Measures that fell below the detection limit were included as zeros in statistical analyses. All variables were tested for normality and homoscedasticity using Shapiro tests and Bartlett tests, respectively, with the vegan package (v2.6.4; Oksanen et al., 2022). To compare water parameters between sites, Welch analyses of variance (ANOVAs) and Games-Howell post-hoc comparisons were run using the vegan package (v2.6.4; Oksanen et al., 2022). Kruskal-Wallis rank-sum tests were used for water parameters with non-normal distributions (conductivity, dissolved oxygen, nitrate) using the agricolae package (v1.3.5; de Mendiburu, 2021), and post-hoc Dunn tests were run using the FSA package (v0.9.4; Ogle et al., 2023).
Statistical analyses of diversity were performed separately for each molecular marker (16S and rbcL). Observed (ASV richness) and Shannon alpha diversity metrics were calculated for each site using the phyloseq package (v1.42.0; McMurdie & Holmes, 2013).
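For readers less familiar with these metrics, observed richness is the number of ASVs with a nonzero count in a sample, and the Shannon index is H' = −Σ p_i ln(p_i), where p_i is the proportion of reads belonging to ASV i. The following is a minimal illustrative sketch in Python, not the phyloseq implementation used in the study; the function names and the example counts in the test are hypothetical.

```python
import math


def observed_richness(counts):
    """Number of ASVs detected (count > 0) in one sample."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over ASVs with p_i > 0.

    Uses the natural logarithm. An empty sample returns 0.0 here,
    which is a convention chosen for this sketch.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)
```

A sample with four equally abundant ASVs gives H' = ln(4), while a sample with a single ASV gives H' = 0, so the index rises with both richness and evenness.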

Microbial community composition was compared by generating Bray-Curtis community dissimilarity matrices for each sample and running a permutational analysis of variance (PERMANOVA) test to investigate differences between sites using the microViz package (v0.10.7; Barnett et al., 2021). A post-hoc pairwise PERMANOVA was run to determine whether sites differed from one another using the pairwiseAdonis package (v0.4.1; Martinez Arbizu, 2020). To investigate the influence of environmental parameters on community composition, significance of variables was tested using the function envfit of the vegan package (v2.6.4; Oksanen et al., 2022), which through multiple regression indicated that all variables were significantly related to the ordination axes (p < .05). An RDA ordination for each marker was plotted with these variables along with taxa that contributed most to the axes using the microViz package (v0.10.7; Barnett et al., 2021).
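The Bray-Curtis dissimilarity between two samples is the sum of absolute differences in taxon abundances divided by the total abundance across both samples, so it ranges from 0 (identical composition) to 1 (no shared taxa). A minimal sketch in Python is given below for illustration; it is not the microViz or vegan code used in the study, and the function names and example sample labels are hypothetical.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors.

    BC = sum(|a_i - b_i|) / sum(a_i + b_i)
    Two empty samples return 0.0, a convention chosen for this sketch.
    """
    if len(a) != len(b):
        raise ValueError("abundance vectors must have the same length")
    denominator = sum(a) + sum(b)
    if denominator == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


def dissimilarity_matrix(samples):
    """Pairwise Bray-Curtis values for a dict of sample -> abundance vector."""
    names = list(samples)
    return {
        (i, j): bray_curtis(samples[i], samples[j])
        for i in names
        for j in names
    }
```

The resulting pairwise matrix is the kind of input that PERMANOVA and Mantel tests operate on.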
Spearman's rank correlation coefficients were calculated to assess correlations between water parameters and taxa using the ggpubr package (v0.6.0; Kassambara, 2023). To further explore environmental influences and compare them to the effects of geographic distance between sites, Mantel tests were performed on the environmental data, geographic distances generated using the geosphere package (v1.5.18; Hijmans, 2022), and Bray-Curtis community dissimilarity matrices to determine the significance and relative influence of these variables with the vegan package (v2.6.4; Oksanen et al., 2022). To visualize taxonomic composition of OAK and GSS, heatmaps were generated using Hellinger-transformed relative abundances of taxa with >5% prevalence using the microViz package (v0.10.7; Barnett et al., 2021).
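To illustrate how geographic distances between sites can be derived from coordinates, the sketch below converts the degree-minute-second coordinates reported for MIS, ECB, FTN, and GSS to decimal degrees and applies the haversine formula. This is an assumption-laden approximation: it treats Earth as a sphere with a mean radius of 6371.0088 km, whereas the geosphere function actually used in the study is not specified here and may use an ellipsoidal model. OAK is omitted because its coordinates are not given in this section, and the function names are hypothetical.

```python
import math

# Assumed mean Earth radius for a spherical approximation (km).
EARTH_RADIUS_KM = 6371.0088


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if hemisphere in ("S", "W") else value


def haversine_km(p, q):
    """Great-circle distance (km) between two (lat, lon) points in degrees."""
    lat1, lon1 = (math.radians(v) for v in p)
    lat2, lon2 = (math.radians(v) for v in q)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2.0) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


# Coordinates as reported in the site descriptions.
SITES = {
    "MIS": (dms_to_decimal(45, 11, 54.2, "N"), dms_to_decimal(83, 19, 30.2, "W")),
    "ECB": (dms_to_decimal(45, 5, 7.5, "N"), dms_to_decimal(83, 19, 28.3, "W")),
    "FTN": (dms_to_decimal(45, 3, 44.9, "N"), dms_to_decimal(83, 25, 52.6, "W")),
    "GSS": (dms_to_decimal(41, 46, 4.3, "N"), dms_to_decimal(83, 27, 21.7, "W")),
}

pairwise_km = {
    (a, b): haversine_km(SITES[a], SITES[b])
    for a in SITES
    for b in SITES
    if a < b
}
```

The three Alpena sites fall within 20 km of one another under this approximation, while GSS lies several hundred kilometers to the south.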

| Water parameters
Water parameters measured differed significantly between sites (Table 1), except for percent dissolved oxygen (H(4) = 8.79, p = .067). showed a gradient of concentrations, from highest at OAK, to intermediate at FTN and GSS, to lowest at ECB (F(3,15) = 17.9, p < .001).
Nitrate concentrations were higher at OAK and ECB than at the other sites (H(4) = 12.3, p = .020), with FTN, GSS, and MIS samples never exceeding the detection limit (0.01 mg/L). No soluble reactive phosphorus (SRP) concentrations were found above the detection limit (0.005 mg/L).

| Microbial mats
Microbial mat growth was found at all sites but was limited during the spring collection period at ECB. At GSS, underwater photography was used to observe microbial mat growth near its source at 13 m depth. The camera revealed lawn-like, purple microbial mat growth in the area surrounding the outlet at GSS, with finger-like structures created by gases underneath the mat, a macroscopically similar community to those documented at MIS (Figure 1; Biddanda et al., 2015). Some mats found at ECB and FTN were also purple, but the FTN mats were notably thicker and included more white filamentous growth. Mats at OAK appeared largely composed of filamentous white bacteria, with floating mats showing a mixture of purple, gray, and green coloration macroscopically.

| Metabarcoding
The  and 3, respectively. Overall, observed alpha diversity was an order of magnitude higher for the 16S than the rbcL dataset (Figure 2).
TABLE 2 Bacteria genera/identifiers present (X) or absent (−) in each site.

| Diversity
The environmental variables for both 16S and rbcL indicate that increased pH may be contributing to the unique microbial community in MIS.
Mantel tests found that the environmental variables measured were significantly correlated (p < .05) with the community distance matrices for both the 16S and rbcL datasets (Table 4). A matrix of all environmental variables measured (AllEnv) and a matrix of geographic distances were each tested against the community matrix to determine the importance of local conditions and biogeography to changes in community composition. For both markers, both the geographic distance and the environmental variables showed significant correlation with the community matrix, with the geographic correlations (16S: r = .5331, rbcL: r = .6022) being slightly stronger than the environmental correlations (16S: r = .3894, rbcL: r = .4621).
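A Mantel test correlates the corresponding off-diagonal entries of two distance matrices and assesses significance by repeatedly permuting the rows and columns of one matrix together. The sketch below is a simplified Python illustration of that idea using a Pearson correlation and a one-sided permutation p-value; it is not the vegan implementation used in the study, and the function names, default of 999 permutations, and fixed seed are choices made for this example.

```python
import math
import random


def upper_triangle(matrix, order):
    """Off-diagonal upper-triangle entries after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant distances")
    return sxy / math.sqrt(sxx * syy)


def mantel(dist_x, dist_y, permutations=999, seed=0):
    """Return (r, p) for two square, symmetric distance matrices.

    p is the share of permutations (plus the observed ordering) whose
    correlation is at least as large as the observed r.
    """
    n = len(dist_x)
    identity = list(range(n))
    x = upper_triangle(dist_x, identity)
    r_observed = pearson(x, upper_triangle(dist_y, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of dist_y together
        if pearson(x, upper_triangle(dist_y, order)) >= r_observed:
            hits += 1
    return r_observed, (hits + 1) / (permutations + 1)
```

Running the same procedure once with a geographic distance matrix and once with an environmental distance matrix, each against the community dissimilarity matrix, yields correlation coefficients that can be compared in the manner described above.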

| Community composition
The most abundant taxa at each site are shown by Hellinger-transformed relative abundances in heatmaps (Figure 4).
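The Hellinger transformation takes the square root of each taxon's relative abundance within a sample, which reduces the weight of highly abundant taxa. A minimal Python sketch follows; it is illustrative rather than the microViz code used for the figures, and the prevalence filter assumes that prevalence means the fraction of samples in which a taxon is detected, with hypothetical function names.

```python
import math


def hellinger(counts):
    """Square root of relative abundance for each taxon in one sample."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [math.sqrt(c / total) for c in counts]


def prevalent_taxa(table, threshold=0.05):
    """Taxa detected in more than `threshold` of samples.

    `table` maps taxon name -> list of counts across samples.
    Assumption: prevalence = fraction of samples with count > 0.
    """
    kept = []
    for taxon, counts in table.items():
        prevalence = sum(1 for c in counts if c > 0) / len(counts)
        if prevalence > threshold:
            kept.append(taxon)
    return kept
```

Because the squared Hellinger values within a sample sum to one, the transformed profiles remain comparable between samples with different sequencing depths.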

| Water parameters
Despite a common groundwater aquifer source providing a constant flow of compositionally similar water (Snider et al., 2017), conditions at MIS, ECB, and FTN differed in temperature, pH, and chloride concentrations. A main driver of this habitat variety may be mixing with surface water, which is nonexistent at FTN, limited at MIS (Ruberg et al., 2008), and constant at ECB (Snider et al., 2017). Low-oxygen, high-sulfur conditions at these springs contrast with the surrounding lake waters bordering MIS, ECB, and GSS, where percent dissolved oxygen levels approach complete saturation and sulfate concentrations are below 40 mg/L (Biddanda et al., 2012; Haack et al., 2005). Percent dissolved oxygen of most samples approached or exceeded the threshold for hypoxia (30%, Steckbauer et al., 2011), while the values recorded from MIS and the GSS source approached anoxic conditions.
While pH differed between MIS and GSS, these sites were similar in conductivity, percent dissolved oxygen, nitrate, and sulfate concentrations, presenting comparable unique conditions for mat communities at both locations. Despite these similarities, a direct comparison of these mat samples revealed significantly different microbial communities for both the 16S and rbcL markers (PERMANOVA, p < .001). pH differed between MIS and GSS. Depth is another factor that differentiates these

| Diversity
Our metabarcoding approach revealed high levels of bacterial diversity in MIS. Kinsman-Costello et al. (2017) also reported high bacterial diversity from MIS, but differences in sample processing and data analyses prevent a direct comparison of alpha diversity. Our study also revealed a diverse microbial community in GSS, which had not been previously investigated with high-throughput sequencing techniques but had been documented to contain cyanobacteria, sulfur-metabolizing bacteria, and Archaea using clone libraries (Chaudhary et al., 2009). Additionally, few explorations of eukaryotic diversity have occurred at these sites (except Nold, Pangborn, et al., 2010), and our study presents the first targeted survey of diatom diversity at MIS, ECB, FTN, GSS, and OAK. Distinct bacterial and diatom communities were found at each site, despite the shared groundwater sources and geographic proximity between some sites (e.g., <20 km between FTN, ECB, and MIS). These sulfur spring sites presented a range of habitats. ECB and GSS have increased habitat complexity, which has been correlated with increased diversity of freshwater benthic microbial communities (Levi et al., 2017; Singer et al., 2010).
Higher nitrate concentrations at ECB could also contribute to more algal taxa inhabiting the site. However, low nutrients in the water measured at MIS may not result in limitation for microbes, as the sediment beneath the microbial mat is known to accumulate organic ma- Influence from surrounding surface waters at sites with higher levels of surface mixing such as ECB could also lead to higher diversity values due to increased dispersal of free-floating microbes. Microbes from surrounding waters were undoubtedly collected within our microbial mat samples, but our plankton tows allowed us to eliminate some of this suspended community from our analyses. Plankton tow samples were composed mainly of Lindavia (Malik & Saros, 2016) and Cyclotella (Saros & Anderson, 2015), taxa that are commonly found in the water column, justifying their removal from the analyses. While dispersal abilities of suspended microbes present an unavoidable issue when characterizing benthic microbial community composition, using plankton tow sampling to eliminate taxa from further analyses of benthic communities can increase accuracy, particularly for groups such as diatoms where growth habits are well established. However, since planktonic and benthic algal communities influence each other (Stevenson et al., 1996), the number of settled cells in a benthic community could have an impact on its structure and function. The diversity of diatom taxa we found within these microbial mat communities highlights the need for more research on eukaryotic mat community members and the roles they may play in these mats.

| Factors contributing to community differences
The stark difference between OAK mat communities and other sites was strongly associated with temperature, conductivity, and chloride concentrations. MIS was associated with higher pH than the other sites. The sites were differentiated more distinctly by environmental variables in the rbcL dataset than in the 16S dataset, indicating that these variables may influence diatom communities more strongly than bacteria. This could also be due to the increased taxonomic resolution we were able to use (order for 16S vs. genus for rbcL). The rbcL ordination showed positive dissolved oxygen and sulfate gradients associated with GSS that were not observed in the 16S ordination, suggesting that the bacterial communities at GSS may be influenced by variables that were not measured. These trends in community dissimilarity indicate that environmental variables may vary strongly in their influence on different members of mat communities (e.g., Lu et al., 2023). Metabarcoding studies have been useful for exploring biogeographic patterns of diversity and taxonomy in bacteria (Varliero et al., 2023) and provide an opportunity to develop large datasets describing microbial communities that can be used with environmental and geographic variables to determine factors influencing these communities. Although the environmental variables had significant effects on the diatom and bacterial community composition of these microbial mats, the variables measured explained a low percentage of variance.
Geographic distance showed a significant correlation with differences in community composition in this study. Major barriers to dispersion exist between these isolated sites. While cosmopolitan species may travel through the Great Lakes, these isolated habitats are unlikely to be reached by microbes specializing in high-sulfur, low-oxygen conditions. Groundwater is a likely source for some of the microbes in these communities, particularly bacteria. Further studies of groundwater aquifer biodiversity, along with exploration of the evolutionary history of taxa in these isolated spring ecosystems, could help answer important questions about dispersal and its role in the microbial biogeography of these

| Community composition
As expected, cyanobacterial ASVs were abundant in the 16S-generated community. Interestingly, macroscopically similar purple-colored cyanobacterial mats observed in GSS, MIS, and ECB did not result in similar bacterial beta diversity. The Synechococcales and Pseudanabaenales (Cyanobacteria), along with the Beggiatoales (sulfur-oxidizing Bacteria), were associated with low dissolved oxygen in the RDA ordination. The Holophagales, a rare and poorly described anaerobic group (Anderson et al., 2011), contributed to separation between the MIS and GSS sites. Although these mats are dominated by cyanobacteria, differences in other Bacteria and Archaea may drive significant differences in community composition in microbial mats. Our 16S primers amplified some archaeal taxa, including the ammonium-oxidizing order Nitrososphaerales (Könneke et al., 2005), which was associated with high dissolved oxygen and GSS sites.
While archaeal diversity is poorly understood (Adam et al., 2017) and universal 16S primers may be unable to detect many Archaea (Eloe-Fadrosh et al., 2016), these primers may allow for limited quantification of archaeal communities (Fadeev et al., 2021). Development of Archaea-specific primers for metabarcoding may be required to better understand their diversity and functional role at these sites.
The rbcL marker revealed a diverse array of diatom taxa with high taxonomic resolution. This study adds to previous metabarcoding research that has used the rbcL marker to successfully characterize algal communities (e.g., Fawley et al., 2021; Pérez-Burillo et al., 2022; Wolf & Vis, 2019). Culturing diatoms from our samples proved to be an important and successful method to improve the accuracy of taxonomic assignment. For the Michigan sites, almost half the reads generated (42.6%) were represented in our culturing efforts, increasing our confidence in the taxonomy assignment for these diatom communities. In total, 119 taxonomy assignments conflicted with species-level assignments in Diat.barcode, suggesting that regional differences between our sites and those used to create the reference database (i.e., Michigan and Florida vs. primarily European taxa) could lead to discrepancies, and further stressing the value of incorporating a culture-based DNA reference library into metabarcoding studies. Species-level taxonomy assignment is difficult even with sufficient reference information due to cryptic and unresolved species complexes, which can be found in ubiquitous groups such as Fragilaria (Van de Vijver et al., 2022) and Nitzschia (Rimet et al., 2014). Additionally, we found that morphologically distinct Cymbella isolates share identical rbcL sequences for the metabarcoding region, suggesting the short (312 bp) rbcL region may be too conserved for species-level identification in some genera. While genus-level identification can provide sufficient resolution for accurate biomonitoring (Rimet & Bouchez, 2012), species-level identification is a component of many biomonitoring programs because species within the same genus may differ widely in responses to water quality (e.g., Ponader & Potapova, 2007), especially for large, diverse genera with many species (Lowe, 1974) such as Navicula (Reavie et al., 2006) and Nitzschia (Hamsher et al., 2004).
Our culturing efforts yielded a wide variety of diatom genera on WC medium, but no taxa from OAK were cultured successfully, indicating a mismatch between our medium and site conditions that may be overcome with a more rigorous culturing effort. Cyanobacterial culturing success was limited mainly to Anagnostidanema and Microcoleus but still contributed valuable reference information for taxonomic assignment. A strategic culturing effort pairs well with a metabarcoding survey to characterize microbial communities and could be strengthened further if the use of longer marker regions is made possible by future sequencing technology.
Most of the dominant diatoms found in these microbial mat communities represented benthic, motile groups. Biddanda et al. (2023) noted that the mass vertical microbial migration of microbial mats occurs at a small scale but may have large impacts on metabolic processes in the mat. Motile diatoms may actively participate in this migration; for example, Craticula optimizes nitrogen respiration in low light conditions (Merz et al., 2021). While the focus of most microbial mat research has remained on cyanobacteria due to their conspicuity and abundance, diatoms may also serve an important role. Future studies should investigate other motile diatoms in microbial mat communities to see whether they share this unique metabolic strategy or use another.
Our study presents the first diatom surveys performed at these sites. GSS microbial mats were dominated by Navicula oblonga. This taxon may occupy a similar role in the mat community to Craticula cuspidata, which dominates MIS, as both are motile taxa with similar autecology (Lowe, 1974). The presence and relatively large cell size (>100 μm) of Navicula oblonga in GSS samples could also contribute to an increased number of reads for each individual and overrepresent the abundance of Navicula in our analyses, an issue that may be resolved by developing correction factors for such taxa (Vasselon et al., 2018). Nitzschia were found in all site groups, and their role in microbial mat communities merits further investigation. Additionally, cryptic diversity within the Nitzschia palea species complex was noted in our cultured sequences (data not shown), and these isolated sites could provide further insight into the evolutionary history of this taxon. OAK diatom communities were dominated by Hyalosynedra, characterized as a benthic marine genus (Belando et al., 2018). This was surprising in a groundwater-fed habitat isolated from marine surface waters, although conductivity and chloride measures suggested that water conditions at OAK could be considered brackish (Remane & Schlieper, 1971).
An issue with using DNA to characterize or explore algal communities is the persistence of DNA in water. Environmental DNA may persist long enough to be transported in the water column and consequently be detected at locations where the organism has not actually been present (Carraro et al., 2018; Shogren et al., 2018).
Several studies show that eDNA persistence in water may reach 4 weeks, but most degradation occurs within the first few days (Collins et al., 2018; Lance et al., 2017; Strickler et al., 2015; Tsuji et al., 2017; Weltz et al., 2017). In contrast, eDNA in sediment and biofilms has been known to persist for longer time periods (Corinaldesi et al., 2005; Domaizon et al., 2017). While seasonal and geographic variations should be considered, 16S rRNA marker genes for Bacteroides have been shown to persist in water for over a week when held at 10°C (Okabe & Shimazu, 2007). Logistical limitations due to difficult site access required MIS samples to remain at 10°C for 72 h. The aforementioned studies of aquatic DNA degradation suggest that, despite some sample processing limitations, our results should be considered reliable and that reasonable sample processing times may be allowed for this type of study, with caution.
For both 16S and rbcL, FTN had relatively low diversity and ECB had relatively high diversity. At OAK, relatively high observed ASV richness and Shannon diversity for 16S contrasted with low rbcL diversity. Beta diversity between sites was significantly different for both 16S (F 4,65 = 6.72, p < .001) and rbcL markers (F 4,78 = 22.93, p < .001). A pairwise PERMANOVA post-hoc test revealed that all sites differed from each other for each marker (p < .001 for all pairwise comparisons). These site differences are reflected in the clustering of samples by site in RDA ordinations (Figure 3).

FIGURE 1
Comparison of underwater imagery of microbial mats found in: (a) Middle Island Sinkhole, Alpena, MI (Rob Paddock, University of Wisconsin), (b) Great Sulfur Spring, Erie, MI.
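The F statistics reported for the between-site comparison are PERMANOVA pseudo-F ratios. As a minimal sketch of how such a ratio can be derived from a pairwise distance matrix (following the standard sums-of-squares partition of distances), the hypothetical function `pseudo_f` below is illustrative only; it omits the permutation step used to obtain p-values, and it is not the pipeline used in this study.

```python
def pseudo_f(dist, groups):
    """PERMANOVA-style pseudo-F from a square distance matrix.

    dist   : list of lists, symmetric pairwise distances between samples
    groups : list of group labels (e.g., site codes), one per sample
    """
    n = len(groups)
    labels = sorted(set(groups))
    a = len(labels)

    # Total sum of squares: squared distances over all pairs, divided by N.
    ss_total = sum(dist[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / n

    # Within-group sum of squares: the same quantity computed per group.
    ss_within = 0.0
    for label in labels:
        idx = [i for i, g in enumerate(groups) if g == label]
        pair_sum = sum(dist[i][j] ** 2
                       for k, i in enumerate(idx) for j in idx[k + 1:])
        ss_within += pair_sum / len(idx)

    ss_among = ss_total - ss_within
    # Degrees of freedom: (a - 1) among groups, (N - a) within groups.
    return (ss_among / (a - 1)) / (ss_within / (n - a))


# Toy example: four samples on a line, two per hypothetical site.
points = [0.0, 2.0, 10.0, 12.0]
toy_dist = [[abs(p - q) for q in points] for p in points]
toy_f = pseudo_f(toy_dist, ["A", "A", "B", "B"])
```

With five sites, the numerator degrees of freedom are 4, which matches the first subscript of the reported F values; the second subscript is the number of samples minus the number of sites.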
For the cyanobacteria, Planktothrix and Limnothrix dominated GSS samples, while FTN, ECB, and MIS were composed primarily of Microcoleus. Thiothrix, a genus of sulfur-oxidizing bacteria, was abundant in some samples from each of the sites, except for MIS. The MIS bacterial community was more diverse and less dominated by a single genus with Beggiatoa and Rhodoferax at higher abundance. For the diatom community, a variety of genera contributed to the abundance in each sample. Most samples were dominated by the speciose Navicula and Nitzschia, except for MIS. MIS contained primarily Craticula and Staurosira, whereas OAK was dominated by Brachysira, Halamphora, and the marine genera Hyalosynedra and Envekadea.

FIGURE 2
Boxplots representing ASV richness and Shannon alpha diversity metrics for the 16S (a, b) and rbcL (c, d) datasets at each site. Sites sharing a capital letter are not significantly different as determined by one-way ANOVAs with Tukey post-hoc tests (p < .05). Thick black lines within the box represent median values, boxes represent the interquartile range, and whiskers and points represent the range, with points outside the whiskers representing outliers (values over or under 1.5 times the interquartile range). ECB, El Cajon Bay; FTN, Alpena Fountain; GSS, Great Sulfur Spring; MIS, Middle Island Sinkhole; OAK, Florida Oak Spring.
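For readers unfamiliar with the two alpha diversity metrics shown in Figure 2, the following minimal sketch computes observed ASV richness and the Shannon index from a vector of per-ASV read counts for one sample. The function names and the toy counts are hypothetical, and the natural logarithm is assumed; the source does not state which log base was used.

```python
import math


def observed_richness(counts):
    """Number of ASVs with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over ASVs with reads."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total  # relative abundance of this ASV
            h -= p * math.log(p)
    return h


# Toy samples (hypothetical read counts per ASV).
even_sample = [10, 10, 10, 10]   # four equally abundant ASVs
skewed_sample = [37, 1, 1, 1]    # same richness, one dominant ASV

even_h = shannon_index(even_sample)
skewed_h = shannon_index(skewed_sample)
```

Both toy samples have a richness of four, but the evenly distributed sample has the higher Shannon value, which is why the two metrics can rank sites differently.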

FIGURE 3
Redundancy analysis (RDA) ordinations showing relationships between environmental variables and taxa explained by the axes for the 16S (a) and rbcL (b) datasets, respectively. Variables include percent dissolved oxygen (ODO.), temperature (Temp), specific conductivity (Cond.), pH, sulfate (SO4.mg.L), and chloride (Cl.mg.L).

two sites (MIS = 23 m; GSS = 13 m), with light availability for photoautotrophs more limited in MIS than in GSS. Similar water parameters were found at GSS to those recorded previously, except for pH, which was 6.4 in Chaudhary et al. (2009) and 7.35 in our study. OAK differed from other sites in temperature, conductivity, and chloride, factors related to its warmer climate and proximity to the Atlantic Ocean. Analyses of the main salts contributing to high chloride concentrations (e.g., NaCl, KCl, MgCl2) would provide more insight into the causes of high chloride in the groundwater at these sites. OAK had low dissolved oxygen (similar to the Michigan sites) and had sulfate concentrations similar to the Alpena, Michigan sites (FTN, ECB, and MIS).
terial and promote nutrient flux to surface mats (Kinsman-Costello et al., 2017). Measurements of flux at the sediment-water interface would be useful to compare nutrient availability as a contribution to microbial diversity at other sites in the future. The rbcL dataset showed significantly higher Shannon diversity at ECB and OAK than other sites, while GSS showed intermediate diversity values.

FIGURE 4
Heatmaps displaying the bacterial/archaeal genera (a, 16S) and diatom genera (b, rbcL) with the highest relative abundances in samples from each site. Color across the top of each plot indicates site, and color of each box indicates Hellinger-transformed relative abundance. The number of ASVs assigned to each group is listed in parentheses next to each taxon.
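The Hellinger transformation referenced in the heatmap caption is the square root of each taxon's relative abundance within a sample. A minimal sketch follows; the function name `hellinger_transform` and the toy counts are hypothetical and are not taken from the study's data or code.

```python
import math


def hellinger_transform(sample_counts):
    """Return sqrt(count / sample total) for each taxon in one sample."""
    total = sum(sample_counts)
    if total == 0:
        # An empty sample has no defined proportions; return zeros.
        return [0.0 for _ in sample_counts]
    return [math.sqrt(c / total) for c in sample_counts]


# Toy sample (hypothetical read counts for three genera).
toy_counts = [25, 75, 0]
toy_hellinger = hellinger_transform(toy_counts)
```

Because the squared transformed values sum to one for every sample, the transformation down-weights dominant taxa relative to raw proportions while keeping samples comparable regardless of sequencing depth.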
TABLE 4
Mantel test results for environmental variables and geographic distance.
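A Mantel test asks whether two distance matrices over the same samples (for example, community dissimilarity and geographic distance) are correlated more strongly than expected by chance. The sketch below is a minimal permutation version, assuming a Pearson correlation and a one-sided test; the source does not specify these settings here, and the function names, the seed, and the toy matrices are hypothetical.

```python
import math
import random


def upper_triangle(matrix, order):
    """Flatten the upper triangle after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (observed r, one-sided permutation p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(dist_a)))
    a_vec = upper_triangle(dist_a, identity)
    observed = pearson(a_vec, upper_triangle(dist_b, identity))
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute sample labels of the second matrix
        if pearson(a_vec, upper_triangle(dist_b, order)) >= observed:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)


# Toy example: the second matrix is the first one scaled by two.
positions = [0.0, 1.0, 3.0, 7.0, 12.0]
geo = [[abs(p - q) for q in positions] for p in positions]
community = [[2.0 * d for d in row] for row in geo]
r_obs, p_val = mantel(geo, community)
```

Permuting rows and columns together, rather than shuffling individual distances, preserves the dependence among distances that share a sample, which is the reason a Mantel test is used instead of an ordinary correlation test.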